Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression
Authors
Abstract
Generalized Linear Models (GLMs) and Single Index Models (SIMs) provide powerful generalizations of linear regression, where the target variable is assumed to be a (possibly unknown) 1-dimensional function of a linear predictor. In general, these problems entail non-convex estimation procedures, and, in practice, iterative local search heuristics are often used. Kalai and Sastry (2009) recently provided the first provably efficient method for learning SIMs and GLMs, under the assumptions that the data are in fact generated under a GLM and under certain monotonicity and Lipschitz constraints. However, to obtain provable performance, the method requires a fresh sample every iteration. In this paper, we provide algorithms for learning GLMs and SIMs, which are both computationally and statistically efficient. We also provide an empirical study, demonstrating their feasibility in practice.
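To make the setting concrete: a SIM assumes E[y | x] = u(w · x) for an unknown weight vector w and an unknown non-decreasing link u. The sketch below alternates an isotonic (Pool Adjacent Violators) fit of the link with a Perceptron-like weight update, in the spirit of the Isotron idea credited above to Kalai and Sastry (2009). It is an illustration under stated assumptions, not necessarily the algorithms proposed in this paper: the function names `pav` and `isotron_sketch`, the iteration count, and the unit step size are all hypothetical choices, and the same sample is reused in every iteration.

```python
import numpy as np

def pav(y):
    """Pool Adjacent Violators: least-squares non-decreasing fit to the sequence y."""
    blocks = []  # each block is [mean, size]
    for v in y:
        blocks.append([float(v), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, s2 = blocks.pop()
            m1, s1 = blocks.pop()
            blocks.append([(m1 * s1 + m2 * s2) / (s1 + s2), s1 + s2])
    return np.concatenate([np.full(s, m) for m, s in blocks])

def fit_link(z, y):
    """Isotonic fit of y against projections z, returned in the original order.
    Ties in z are broken by stable sort order (a simplification)."""
    order = np.argsort(z, kind="stable")
    u = np.empty(len(y))
    u[order] = pav(y[order])
    return u

def isotron_sketch(X, y, iters=50):
    """Alternate an isotonic link fit with a Perceptron-like update of w."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        u = fit_link(X @ w, y)        # current estimate of u(w . x_i)
        w = w + X.T @ (y - u) / n     # move w along the averaged residuals
    return w, fit_link(X @ w, y)      # final weights and fitted values

# Usage on synthetic data with a sigmoid link (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = 1.0 / (1.0 + np.exp(-(X @ w_true)))
w_hat, u_hat = isotron_sketch(X, y, iters=20)
```

The fitted values `u_hat` are non-decreasing in the learned projection `X @ w_hat` by construction. Whether `w_hat` approaches the direction of `w_true` depends on the data and on the assumptions named in the abstract (monotone, Lipschitz link).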
Similar resources
The Isotron Algorithm: High-Dimensional Isotonic Regression
The Perceptron algorithm elegantly solves binary classification problems that have a margin between positive and negative examples. Isotonic regression (fitting an arbitrary increasing function in one dimension) is also a natural problem with a simple solution. By combining the two, we get a new but very simple algorithm with strong guarantees. Our ISOTRON algorithm provably learns Single Index...
Learning Single Index Models in High Dimensions
Single Index Models (SIMs) are simple yet flexible semi-parametric models for classification and regression. Response variables are modeled as a nonlinear, monotonic function of a linear combination of features. Estimation in this context requires learning both the feature weights, and the nonlinear function. While methods have been described to learn SIMs in the low dimensional regime, a metho...
Single-Vehicle Run-Off-Road Crash Prediction Model Associated with Pavement Characteristics
This study aims to evaluate the impact of pavement physical characteristics on the frequency of single-vehicle run-off-road (ROR) crashes in two-lane separated rural highways. In order to achieve this goal and to introduce the most accurate crash prediction model (CPM), authors have tried to develop generalized linear models, including the Poisson regression (PR), negative binomial regression (...
A Faster Algorithm Solving a Generalization of Isotonic Median Regression and a Class of Fused Lasso Problems
Many applications in the areas of production, signal processing, economics, bioinformatics, and statistical learning involve a given partial order on certain parameters and a set of noisy observations of the parameters. The goal is to derive estimated values of the parameters that satisfy the partial order that minimize the loss of deviations from the given observed values. A prominent applicat...
Isotonic single-index model for high-dimensional database marketing
While database marketers collect vast amounts of customer transaction data, its utilization to improve marketing decisions presents problems. Marketers seek to extract relevant information from large databases by identifying significant variables and prospective customers. In small databases, they could calibrate logistic regression models via maximum-likelihood methods to determine significant v...